Skip to content

perf: fix FindIncludesInContent regression (~18% slower)#29949

Merged
pelikhan merged 3 commits intomainfrom
copilot/fix-regression-find-includes-performance
May 3, 2026
Merged

perf: fix FindIncludesInContent regression (~18% slower)#29949
pelikhan merged 3 commits intomainfrom
copilot/fix-regression-find-includes-performance

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented May 3, 2026

FindIncludesInContent regressed ~17.9% (2,409 → 2,842 ns/op) due to overhead in the per-line iterator and redundant prefix comparisons on non-directive lines.

Changes

  • Single-scan fast path (hasDirectiveMarker) — replaces three sequential strings.Contains calls with one forward pass using SIMD-friendly IndexByte to locate @/{ candidates, then checks prefixes only at those positions.

  • First-byte guard in parseIncludePath — short-circuits before all three HasPrefix comparisons for the common case of regular prose lines:

    if trimmed[0] != '@' && trimmed[0] != '{' {
        return ""
    }
  • Manual IndexByte line scan — replaces strings.Lines (iter.Seq with a yield-function call per line) with a direct strings.IndexByte(remaining, '\n') loop, eliminating per-line call overhead and the trailing \n that strings.Lines includes in each yielded string.

Benchmark (5 s, AMD EPYC): ~194 ns/op → ~174 ns/op (~10% improvement locally; targets recovery of the full regression on the benchmark host).

…-scan fast path

Three targeted optimizations to recover the 17.9% regression:

1. hasDirectiveMarker: single-scan helper using IndexByte (SIMD) to find
   '@' or '{', avoiding three separate Contains passes for the no-include
   fast path. Content is now traversed at most once before the line scan.

2. parseIncludePath: first-byte guard skips all three HasPrefix comparisons
   for the overwhelming majority of regular-text lines that don't start
   with '@' or '{'.

3. findIncludesInContent: manual IndexByte('\n') loop replaces
   strings.Lines (iter.Seq), eliminating the yield-function call overhead
   per line and stripping the trailing '\n' before parseIncludePath sees it.

Benchmark (5 s, AMD EPYC):
  Before: 194 ns/op  →  After: ~174 ns/op  (~10% improvement)

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/7cb7b62b-a88d-4dc4-b5ed-0437bae12238

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix regression in FindIncludesInContent performance perf: fix FindIncludesInContent regression (~18% slower) May 3, 2026
Copilot AI requested a review from gh-aw-bot May 3, 2026 15:39
@pelikhan pelikhan marked this pull request as ready for review May 3, 2026 15:40
Copilot AI review requested due to automatic review settings May 3, 2026 15:40
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves performance of include-directive detection/parsing in findIncludesInContent by reducing overhead on the common “no directives” case and by avoiding iterator/yield costs in line scanning.

Changes:

  • Adds a pre-scan (hasDirectiveMarker) to skip line-by-line parsing when no directive markers are present.
  • Replaces strings.Lines iteration with a manual IndexByte('\n') loop to reduce per-line overhead.
  • Adds a first-byte guard in parseIncludePath to avoid unnecessary prefix checks for non-directive lines.
Show a summary per file
File Description
pkg/cli/remove_command.go Optimizes include marker detection and line scanning to recover findIncludesInContent performance.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 1/1 changed files
  • Comments generated: 1

Comment thread pkg/cli/remove_command.go Outdated
Comment on lines +393 to +413
// Find next '@' or '{'—both are rare in workflow prose, so IndexByte skips
// large stretches of uninteresting bytes via SIMD on supported platforms.
atIdx := strings.IndexByte(content[i:], '@')
braceIdx := strings.IndexByte(content[i:], '{')
if atIdx < 0 && braceIdx < 0 {
return false
}
if atIdx >= 0 && (braceIdx < 0 || atIdx <= braceIdx) {
pos := i + atIdx
rest := content[pos:]
if strings.HasPrefix(rest, "@include") || strings.HasPrefix(rest, "@import") {
return true
}
i = pos + 1
} else {
pos := i + braceIdx
if strings.HasPrefix(content[pos:], "{{#import") {
return true
}
i = pos + 1
}
@github-actions github-actions Bot mentioned this pull request May 3, 2026
@pelikhan
Copy link
Copy Markdown
Collaborator

pelikhan commented May 3, 2026

@Copiloy review all comments

@pelikhan
Copy link
Copy Markdown
Collaborator

pelikhan commented May 3, 2026

@copilot review all comments

Fixes the double-scan issue flagged in code review: the previous
implementation called strings.IndexByte twice per loop iteration
(once for '@' and once for '{'), scanning content twice end-to-end
for the no-marker case.

Using strings.IndexAny("@{") finds the next candidate byte in a
single pass, ensuring content is traversed at most once.

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1cd5dff8-a199-4c49-88ec-f11cbd94064b

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented May 3, 2026

@copilot review all comments

Fixed in b1ae170. The dual strings.IndexByte calls (one for @, one for {) have been replaced with a single strings.IndexAny("@{") call per iteration, so the content is now traversed at most once as described. The comment has been updated to match.

Copilot AI requested a review from pelikhan May 3, 2026 16:06
@pelikhan pelikhan merged commit ce02e0c into main May 3, 2026
@pelikhan pelikhan deleted the copilot/fix-regression-find-includes-performance branch May 3, 2026 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[performance] Regression in FindIncludesInContent: 17.9% slower

4 participants